Uni- and bivariate data visualizations using R

SIMP59: Data Selection and Visualisation
7.5 credits VT25

nils.holmberg@iko.lu.se

Canvas info

This lecture recaps RMarkdown notebooks and the use of dplyr pipes to streamline data analysis workflows. We then turn to univariate data visualizations, focusing on how to visualize distributions with ggplot2. Participants will learn how to construct ggplot2 calls that produce clear and informative plots, including bar charts for categorical variables and histograms for numerical variables. The session also covers frequency, that is, how often each value or range of values occurs, and shows how to read distribution patterns to gain insights from data.
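As a rough illustration of this workflow, the sketch below builds a frequency table with a dplyr pipe, then draws a bar chart for a categorical variable and a histogram for a numerical one. It assumes the palmerpenguins package is installed and uses its `penguins` data frame. The object names (`species_counts`, `p_bar`, `p_hist`) and the bin width of 200 g are arbitrary choices for the example, not part of the lecture material.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)  # assumption: supplies the `penguins` data frame

# Frequency of a categorical variable: one row per species with its count
species_counts <- penguins %>%
  count(species, name = "n")

# Bar chart: geom_bar() counts the rows in each category for us
p_bar <- ggplot(penguins, aes(x = species)) +
  geom_bar()

# Histogram: bins a numerical variable and shows the count per bin.
# Rows with a missing body mass are dropped first to avoid a warning.
p_hist <- penguins %>%
  filter(!is.na(body_mass_g)) %>%
  ggplot(aes(x = body_mass_g)) +
  geom_histogram(binwidth = 200)  # bin width in grams, chosen for illustration
```

Printing `p_bar` or `p_hist` in a notebook chunk renders the plot. Trying a few different `binwidth` values is usually worthwhile, since the apparent shape of a distribution can change with the bin size.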

The second part of the lecture covers bivariate data visualizations for analyzing relationships between variables. We will look at techniques for visualizing one numerical dependent variable against one categorical independent variable, as well as methods for comparing two categorical or two numerical variables. Participants will learn how to represent amounts and proportions effectively and use x–y plots to examine trends and correlations. The session will also address how to visualize uncertainty in data and how to interpret variability in relationships. Finally, we will demonstrate how to save plots for reports and presentations.
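The three bivariate cases, a simple view of uncertainty, and saving a plot can be sketched as follows. This is a minimal example under assumptions: it again uses the `penguins` data frame from the palmerpenguins package, and the object names, the output file name, and the plot dimensions are all made up for illustration.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)  # assumption: supplies the `penguins` data frame

# Keep only rows where both numerical variables are observed
penguins_complete <- penguins %>%
  filter(!is.na(body_mass_g), !is.na(flipper_length_mm))

# Numerical DV by categorical IV: one boxplot per species
p_box <- ggplot(penguins_complete, aes(x = species, y = body_mass_g)) +
  geom_boxplot()

# Two categorical variables: position = "fill" shows proportions, not counts
p_prop <- ggplot(penguins, aes(x = island, fill = species)) +
  geom_bar(position = "fill") +
  labs(y = "proportion")

# Two numerical variables: scatterplot with a linear trend line.
# The shaded band is the 95% confidence interval around the fitted line.
p_xy <- ggplot(penguins_complete,
               aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Save the scatterplot; a temporary directory is used here for illustration
out_file <- file.path(tempdir(), "penguins_xy.png")
ggsave(out_file, plot = p_xy, width = 6, height = 4, dpi = 300)
```

Passing `plot = p_xy` explicitly avoids relying on `ggsave()` picking up the most recently displayed plot, which makes the script easier to rerun.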

Course literature

Wickham, Çetinkaya-Rundel, and Grolemund (2023)

Wilke (2019)

Watt and Naidoo (2025)

Lecture overview

  • rmarkdown notebooks, dplyr pipes
  • univariate data visualizations
  • 1.3 ggplot2 calls
  • 1.4 Visualizing distributions
  • 1.4.1 A categorical variable
  • 1.4.2 A numerical variable
  • frequency
  • 5.2 Distributions
  • histograms
  • bivariate data visualizations
  • 1.5 Visualizing relationships
  • one numerical (DV), one categorical (IV)
  • 1.5.2 Two categorical variables
  • 1.5.3 Two numerical variables
  • 5.1 Amounts
  • 5.3 Proportions
  • 5.4 x–y relationships
  • 5.6 Uncertainty
  • 1.6 Saving your plots

whole

A diagram displaying the data science cycle: Import -> Tidy -> Understand (which has the phases Transform -> Visualize -> Model in a cycle) -> Communicate. Surrounding all of these is Program. Import, Tidy, Transform, and Visualize are highlighted.

Figure 1: In this section of the book, you’ll learn how to import, tidy, transform, and visualize data.

rmarkdown, scripts

import

  • 20 Spreadsheets
  • 21 Databases
  • 22 Arrow
  • 23 Hierarchical data

transform

  • 12 Logical vectors
  • 13 Numbers
  • 14 Strings
  • 15 Regular expressions
  • 16 Factors
  • 17 Dates and times
  • 18 Missing values
  • 19 Joins

figure, pivot

A diagram showing how `pivot_longer()` transforms a simple data set, using color to highlight how column names ("bp1" and "bp2") become the values in a new `measurement` column. They are repeated three times because there were three rows in the input.

Figure 2: The column names of pivoted columns become values in a new column. The values need to be repeated once for each row of the original dataset.
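The reshaping that the figure describes can be reproduced with a small sketch. The column names `bp1`, `bp2`, and `measurement` come from the figure; the `id` column, the `value` column name, and the numbers themselves are illustrative assumptions.

```r
library(tidyr)

# A small wide data set: one row per person, one column per measurement.
# The ids and blood pressure values are invented for illustration.
wide <- data.frame(
  id  = c("A", "B", "C"),
  bp1 = c(100, 140, 120),
  bp2 = c(120, 115, 125)
)

# Column names bp1 and bp2 become values in the new `measurement` column;
# the cell values move into the new `value` column.
long <- wide %>%
  pivot_longer(
    cols      = c(bp1, bp2),
    names_to  = "measurement",
    values_to = "value"
  )
```

The result has six rows: each of the three input rows contributes one row for `bp1` and one for `bp2`. The long format is what ggplot2 typically expects when a variable should be mapped to an aesthetic such as `x` or `fill`.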

Palmer Penguins

Quantitative methods

    1. Experiments and Threats to Validity
    2. Survey Research, Questionnaire
    3. Quantitative Content Analysis

Lectures and workshops

Data collection (Nov 12)

    1. Concept Explication and Measurement
    2. Reliability and Validity
    3. Effective Measurement
    4. Sampling
    5. Content Analysis

Exam question 1

Data analysis (Nov 26)

    1. Experiments and Threats to Validity
    2. Survey Research
    3. Descriptive Statistics
    4. Inferential Statistics
    5. Multivariate Statistics

Exam question 2

9. Experiments and Threats to Validity

  • Random Assignment (p. 225)
  • Between-Subjects Design (p. 227)
  • Within-Subjects Design (p. 228)
  • Treatment Groups (p. 233)
  • Stimulus (p. 233)
  • Control Group (p. 238)
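Random assignment in a between-subjects design can be sketched in a few lines of base R. This is an illustration only: the sample size of 20, the group labels, the seed, and the `participants` data frame are hypothetical and not taken from the course literature.

```r
set.seed(2025)  # fixed seed so the assignment can be reproduced

n <- 20  # hypothetical number of participants (must be even here)
participants <- data.frame(id = seq_len(n))

# Build a vector with equal numbers of each condition, then shuffle it.
# Every participant has the same chance of ending up in either group.
conditions <- rep(c("treatment", "control"), each = n / 2)
participants$group <- sample(conditions)

# Check the group sizes
group_sizes <- table(participants$group)
```

Shuffling a balanced vector guarantees equal group sizes, whereas drawing each participant's condition independently would only balance the groups on average.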

Next steps

Workshop 2, Dec 2

References

Watt, H., and T. Naidoo. 2025. “Data Wrangling Recipes in R.” https://bookdown.org/hcwatt99/Data_Wrangling_Recipes_in_R/#why-data-wrangling-recipes-in-r.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science. 2nd ed. O’Reilly Media. https://r4ds.hadley.nz/.
Wilke, Claus O. 2019. Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures. O’Reilly Media. https://clauswilke.com/dataviz/.